[11] Why Managed Node Groups Keep Your Applications Available (Part 2)

Continuing from the previous post, this article keeps digging into what the managed node group does for us during the node update process.

Setting up the test

  1. Use the node group label with kubectl to filter for the nodes in the managed node group demo-ng:
$ kubectl get node -l eks.amazonaws.com/nodegroup=demo-ng
NAME                                           STATUS   ROLES    AGE     VERSION
ip-192-168-18-230.eu-west-1.compute.internal   Ready    <none>   9m30s   v1.22.12-eks-ba74326
  2. Check which Pods are running on this node:
$ kubectl describe no ip-192-168-18-230.eu-west-1.compute.internal | grep -A 6 "Non-terminated Pods"
Non-terminated Pods:          (4 in total)
  Namespace                   Name                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                      ------------  ----------  ---------------  -------------  ---
  demo                        nginx-demo-deployment-7587998bf4-dkd5t    0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m27s
  kube-system                 aws-node-j2c7p                            25m (1%)      0 (0%)      0 (0%)           0 (0%)         11m
  kube-system                 coredns-5947f47f5f-t4f89                  100m (5%)     0 (0%)      70Mi (1%)        170Mi (2%)     8m27s
  kube-system                 kube-proxy-4pjhk                          100m (5%)     0 (0%)      0 (0%)           0 (0%)         11m
  3. To make sure node ip-192-168-18-230.eu-west-1.compute.internal is terminated, run the following AWS CLI command, which terminates the EKS worker node and at the same time scales the group in to 0 (see the sketch right after the command for how to look up the instance ID):
$ aws autoscaling terminate-instance-in-auto-scaling-group --instance-id i-07e9ea9eb19a67337 --should-decrement-desired-capacity
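
If you need to look up a node's instance ID first, a minimal sketch is to read it from the node's spec.providerID, whose last path segment is the EC2 instance ID (the availability zone shown below is illustrative):

$ kubectl get node ip-192-168-18-230.eu-west-1.compute.internal -o jsonpath='{.spec.providerID}'
aws:///eu-west-1a/i-07e9ea9eb19a67337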

Analyzing the logs

Nodes

Use the following CloudWatch Logs Insights query to analyze the node updates performed by the Lambda function:

filter @logStream like /^kube-apiserver-audit/
 | fields objectRef.name, objectRef.resource, @message
 | sort @timestamp asc
 | filter userAgent like 'vpcLambda' AND verb != 'get' AND verb != 'list' AND objectRef.resource == 'nodes'
 | limit 10000
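
As an aside, the same query can also be issued from the AWS CLI instead of the console. A minimal sketch, assuming control-plane audit logging is enabled so that the logs land in the /aws/eks/ironman-2022/cluster log group:

$ aws logs start-query \
    --log-group-name /aws/eks/ironman-2022/cluster \
    --start-time $(date -d '-1 hour' +%s) \
    --end-time $(date +%s) \
    --query-string "filter @logStream like /^kube-apiserver-audit/ | filter userAgent like 'vpcLambda' and verb != 'get' and verb != 'list' and objectRef.resource == 'nodes' | limit 10000"
# date -d is GNU date; on macOS use: date -v-1H +%s
$ aws logs get-query-results --query-id <queryId returned above>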

The more relevant parts of the logs are excerpted below:

      "userAgent": "vpcLambda/v0.0.0 (linux/amd64) kubernetes/$Format",
      "objectRef": {
        "resource": "nodes",
        "name": "ip-192-168-83-56.eu-west-1.compute.internal",
        "apiVersion": "v1"
      },
      "responseStatus": {
        "metadata": {},
        "code": 200
      },
      "requestObject": {
        "metadata": {
          "labels": {
            "node.kubernetes.io/exclude-from-external-load-balancers": "true"
          }
        }
      },
      ...
      ...
      "requestReceivedTimestamp": "2022-09-25T13:58:43.356273Z",
      "stageTimestamp": "2022-09-25T13:58:43.375313Z",
      "userAgent": "vpcLambda/v0.0.0 (linux/amd64) kubernetes/$Format",
      "objectRef": {
        "resource": "nodes",
        "name": "ip-192-168-83-56.eu-west-1.compute.internal",
        "apiVersion": "v1"
      },
      "responseStatus": {
        "metadata": {},
        "code": 200
      },
      "requestObject": {
        "spec": {
          "unschedulable": true
        }
      },
      ...
      ...
      "requestReceivedTimestamp": "2022-09-25T13:58:43.939359Z",
      "stageTimestamp": "2022-09-25T13:58:43.947171Z",

From the logs above, we can confirm that when the node is terminated, EKS updates the node in two ways (reproduced by hand in the sketch after this list):

  1. It adds the label node.kubernetes.io/exclude-from-external-load-balancers, which excludes the node from load balancers created by the cloud provider. The corresponding feature gate, ServiceNodeExclusion [1], graduated to GA in Kubernetes 1.21 and is enabled by default.
  2. It sets the node's unschedulable field to true, i.e. cordons it.
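
In other words, the Lambda function is doing exactly what an operator would otherwise do by hand. A minimal sketch of the equivalent kubectl commands, using the node name from the logs above:

$ kubectl label node ip-192-168-83-56.eu-west-1.compute.internal \
    node.kubernetes.io/exclude-from-external-load-balancers=true
$ kubectl cordon ip-192-168-83-56.eu-west-1.compute.internal    # sets spec.unschedulable to true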

Pods

Use the following CloudWatch Logs Insights query to analyze the Pod updates performed by the Lambda function:

filter @logStream like /^kube-apiserver-audit/
 | fields objectRef.name, objectRef.resource, @message
 | sort @timestamp asc
 | filter userAgent like 'vpcLambda' AND verb != 'get' AND verb != 'list' AND objectRef.resource == 'pods'
 | limit 10000

The more relevant parts of the logs are excerpted below:

      "userAgent": "vpcLambda/v0.0.0 (linux/amd64) kubernetes/$Format",
      "objectRef": {
        "resource": "pods",
        "namespace": "demo",
        "name": "nginx-demo-deployment-7587998bf4-dkd5t",
        "apiVersion": "v1",
        "subresource": "eviction"
      },
      "responseStatus": {
        "metadata": {},
        "status": "Success",
        "code": 201
      },
      "requestObject": {
        "kind": "Eviction",
        "apiVersion": "policy/v1beta1",
        "metadata": {
          "name": "nginx-demo-deployment-7587998bf4-dkd5t",
          "namespace": "demo",
          "creationTimestamp": null
        }
      },
      ...
      ...
      "requestReceivedTimestamp": "2022-09-25T21:28:10.861815Z",
      "stageTimestamp": "2022-09-25T21:28:10.883851Z",
      "userAgent": "vpcLambda/v0.0.0 (linux/amd64) kubernetes/$Format",
      "objectRef": {
        "resource": "pods",
        "namespace": "kube-system",
        "name": "coredns-5947f47f5f-t4f89",
        "apiVersion": "v1",
        "subresource": "eviction"
      },
      "responseStatus": {
        "metadata": {},
        "status": "Success",
        "code": 201
      },
      "requestObject": {
        "kind": "Eviction",
        "apiVersion": "policy/v1beta1",
        "metadata": {
          "name": "coredns-5947f47f5f-t4f89",
          "namespace": "kube-system",
          "creationTimestamp": null
        }
      },
      ...
      ...
      "requestReceivedTimestamp": "2022-09-25T21:28:10.995432Z",
      "stageTimestamp": "2022-09-25T21:28:11.015341Z",

From the logs above, we can confirm that when the node is terminated, EKS evicts its Pods. Of the four Pods originally running on node ip-192-168-18-230.eu-west-1.compute.internal, the ones evicted were the Deployment-managed CoreDNS and Nginx Pods; aws-node and kube-proxy are DaemonSet Pods and were left in place.
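
For reference, the Eviction requests seen in the audit log can be reproduced by hand through the Pod eviction subresource. A minimal sketch, reusing the Pod name from this demo (policy/v1 is used here since the Eviction API is GA as of Kubernetes 1.22, while the log above still shows policy/v1beta1):

$ cat <<'EOF' > eviction.json
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {
    "name": "nginx-demo-deployment-7587998bf4-dkd5t",
    "namespace": "demo"
  }
}
EOF
$ kubectl create --raw /api/v1/namespaces/demo/pods/nginx-demo-deployment-7587998bf4-dkd5t/eviction -f eviction.json

In day-to-day operation, kubectl drain <node> --ignore-daemonsets issues the same requests for every evictable Pod on the node, which also explains why the DaemonSet Pods were skipped.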

Why does scaling in with an Auto Scaling Group command, rather than an eksctl command, still stay in sync with the managed node group?

After the Auto Scaling Group scale-in above, the eksctl get ng command shows that the node group's desired capacity was indeed set to 0.

$ eksctl get ng --cluster=ironman-2022
CLUSTER NODEGROUP       STATUS          CREATED                 MIN SIZE        MAX SIZE        DESIRED CAPACITY        INSTANCE TYPE   IMAGE ID                ASG NAME    TYPE
ironman-2022 demo-ng         ACTIVE          2022-09-25T12:59:13Z    0               10              0                       m5.large        AL2_x86_64              eks-demo-ng-02c1ba5d-75b6-fbdf-4569-aa31827ad412             managed
ironman-2022 demo-self-ng    CREATE_COMPLETE 2022-09-25T14:03:51Z    1               1               1                       m5.large        ami-08702177b0dcfc054   eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ        unmanaged
ironman-2022 ng1-public-ssh  ACTIVE          2022-09-14T09:54:36Z    0               10              0                       m5.large        AL2_x86_64              eks-ng1-public-ssh-62c19db5-f965-bdb7-373a-147e04d9f124              managed
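
A hedged aside: eksctl appears to read these scaling settings back from the node group and its Auto Scaling Group, which is why the decrement made through the Auto Scaling API shows up here. As a sketch, the same values can be fetched directly (output shown as expected from the table above):

$ aws eks describe-nodegroup --cluster-name=ironman-2022 --nodegroup-name=demo-ng --query nodegroup.scalingConfig
{
    "minSize": 0,
    "maxSize": 10,
    "desiredSize": 0
}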

To confirm the relationship between the node group and the Auto Scaling Group, and whether the managed node group has done anything extra, we inspect the Auto Scaling Group resources through both EKS and CloudFormation.

managed node group

We can look up the managed node group's Auto Scaling Group name with the aws eks describe-nodegroup [2] command, and then check its lifecycle hooks:

$ aws eks describe-nodegroup --cluster-name=ironman-2022 --nodegroup-name=demo-ng --query nodegroup.resources.autoScalingGroups[0].name
"eks-demo-ng-02c1ba5d-75b6-fbdf-4569-aa31827ad412"
$ aws autoscaling describe-lifecycle-hooks --auto-scaling-group-name eks-demo-ng-02c1ba5d-75b6-fbdf-4569-aa31827ad412
{
    "LifecycleHooks": [
        {
            "LifecycleHookName": "Launch-LC-Hook",
            "AutoScalingGroupName": "eks-demo-ng-02c1ba5d-75b6-fbdf-4569-aa31827ad412",
            "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
            "NotificationTargetARN": "arn:aws:sns:eu-west-1:346952985317:eks-asg-lifecycle-hook-topic",
            "HeartbeatTimeout": 1800,
            "GlobalTimeout": 172800,
            "DefaultResult": "CONTINUE"
        },
        {
            "LifecycleHookName": "Terminate-LC-Hook",
            "AutoScalingGroupName": "eks-demo-ng-02c1ba5d-75b6-fbdf-4569-aa31827ad412",
            "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
            "NotificationTargetARN": "arn:aws:sns:eu-west-1:346952985317:eks-asg-lifecycle-hook-topic",
            "HeartbeatTimeout": 1800,
            "GlobalTimeout": 172800,
            "DefaultResult": "CONTINUE"
        }
    ]
}

self-managed

For the self-managed node group created by eksctl, we find its Auto Scaling Group through the CloudFormation stack with aws cloudformation list-stack-resources [3], and then check for lifecycle hooks:

$ aws cloudformation list-stack-resources --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng
...
...
        {
            "LogicalResourceId": "NodeGroup",
            "PhysicalResourceId": "eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ",
            "ResourceType": "AWS::AutoScaling::AutoScalingGroup",
            "LastUpdatedTimestamp": "2022-09-25T14:07:16.122000+00:00",
            "ResourceStatus": "CREATE_COMPLETE",
            "DriftInformation": {
                "StackResourceDriftStatus": "NOT_CHECKED"
            }
        },
...
$ aws autoscaling describe-lifecycle-hooks --auto-scaling-group-name eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ
{
    "LifecycleHooks": []
}

Comparing the two Auto Scaling Groups above, we can confirm that the EKS managed node group additionally configures Auto Scaling lifecycle hooks [4]. The purpose of a lifecycle hook is to let an Auto Scaling group trigger user-defined actions when the corresponding lifecycle events occur. From the output above, we can see that EKS notifies the SNS topic arn:aws:sns:eu-west-1:346952985317:eks-asg-lifecycle-hook-topic on both scale-in and scale-out events:

  • autoscaling:EC2_INSTANCE_LAUNCHING: fired on scale-out events
  • autoscaling:EC2_INSTANCE_TERMINATING: fired on scale-in events
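
If you wanted to replicate this behaviour for a self-managed node group, a minimal sketch would be to attach a hook of your own with put-lifecycle-hook. Note that the topic and role ARNs below are placeholders: you would have to create them yourself, plus something (e.g. your own Lambda function) subscribed to the topic to do the actual draining:

$ aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name Terminate-LC-Hook \
    --auto-scaling-group-name eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ \
    --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
    --notification-target-arn arn:aws:sns:eu-west-1:111122223333:my-drain-topic \
    --role-arn arn:aws:iam::111122223333:role/my-asg-notification-role \
    --heartbeat-timeout 1800 \
    --default-result CONTINUE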

And given that most of the user agents in the earlier audit logs were vpcLambda, it is not hard to imagine that this SNS topic in turn invokes a Lambda function to carry out the node labeling, cordoning, and Pod evictions we observed. Unfortunately, the SNS topic does not belong to the customer account, so this article cannot dig further into the corresponding Lambda function.
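
A quick sanity check that the topic is not ours: trying to read it from the customer account should fail with an authorization error.

$ aws sns get-topic-attributes \
    --topic-arn arn:aws:sns:eu-west-1:346952985317:eks-asg-lifecycle-hook-topic
# expected to fail with an AuthorizationError, since the topic is owned by EKS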

Conclusion

EKS managed node groups rely on Auto Scaling lifecycle hooks [4] to update the node and evict its Pods. Compared with a self-managed node group, where every scale-in or scale-out requires manually running kubectl cordon (and kubectl drain), this is clearly more convenient.


The above uses the logs provided by EKS to verify how upstream Kubernetes behaves. If anything here is wrong, feel free to leave a comment or message me.

References

  1. Kubernetes Feature Gates - https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
  2. aws eks describe-nodegroup - https://docs.aws.amazon.com/cli/latest/reference/eks/describe-nodegroup.html
  3. aws cloudformation list-stack-resources - https://docs.aws.amazon.com/cli/latest/reference/cloudformation/list-stack-resources.html
  4. Amazon EC2 Auto Scaling lifecycle hooks - https://docs.aws.amazon.com/autoscaling/ec2/userguide/lifecycle-hooks.html
